Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis
Authors
Abstract
We propose a novel method for computing sentiment orientation that outperforms supervised learning approaches in time and memory complexity, while its accuracy is not statistically significantly different from theirs. The method relies on a novel approach to generating unigram, bigram and trigram lexicons. Called frequentiment, it calculates the frequency of features (words) in a document and averages their impact on the sentiment score, as opposed to documents that do not contain these features. We then use ensemble classification to improve the overall accuracy of the method. Importantly, the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicons' predictions as input, and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of sentiment annotation methods that includes ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.
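To make the lexicon idea in the abstract concrete, here is a minimal Python sketch of a frequency-based unigram lexicon. It is a simplified stand-in, not the paper's frequentiment formula: the scoring rule (mean label of documents containing a word minus mean label of documents without it), the function names `build_lexicon` and `classify`, and the `min_count` and `threshold` parameters are all assumptions made for illustration.

```python
from collections import defaultdict


def build_lexicon(docs, labels, min_count=2):
    """Assumed scoring rule (not the paper's exact formula): for each word,
    mean label of documents containing it minus mean label of the rest."""
    n = len(docs)
    total = sum(labels)
    count = defaultdict(int)        # number of documents containing the word
    label_sum = defaultdict(float)  # sum of labels over those documents
    for text, y in zip(docs, labels):
        for word in set(text.lower().split()):
            count[word] += 1
            label_sum[word] += y
    lexicon = {}
    for word, c in count.items():
        # Skip rare words and words present in every document
        # (the "without" group would be empty).
        if c < min_count or c == n:
            continue
        mean_with = label_sum[word] / c
        mean_without = (total - label_sum[word]) / (n - c)
        lexicon[word] = mean_with - mean_without
    return lexicon


def classify(text, lexicon, threshold=0.0):
    """Sum the lexicon scores of the words and compare with a threshold."""
    score = sum(lexicon.get(w, 0.0) for w in text.lower().split())
    return 1 if score > threshold else -1


# Toy data, invented for this example: +1 = positive, -1 = negative review.
docs = ["great phone love it", "great battery",
        "bad phone broke", "bad screen hate it"]
labels = [1, 1, -1, -1]
lexicon = build_lexicon(docs, labels)
prediction = classify("great screen", lexicon)
```

Under these assumptions, words that occur equally often in positive and negative reviews score zero, so only discriminative words influence the prediction. Several such lexicons (unigram, bigram, trigram) could each produce a prediction that is then fed to an ensemble classifier, as the abstract describes.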
Similar resources
Improving Sentiment Analysis Through Ensemble Learning of Meta-level Features
In this research, the well-known microblogging site, Twitter, was used for a sentiment analysis investigation. We propose an ensemble learning approach based on the meta-level features of seven existing lexicon resources for automated polarity sentiment classification. The ensemble employs four base learners (a Two-Class Support Vector Machine, a Two-Class Bayes Point Machine, a Two-Class Logis...
A High-Performance Model based on Ensembles for Twitter Sentiment Classification
Background and Objectives: Twitter sentiment classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. Twitter lets users publish tweets to share what they think about various topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...
Sentiment Lexicon Expansion Based on Neural PU Learning, Double Dictionary Lookup, and Polarity Association
Although many sentiment lexicons in different languages exist, most are not comprehensive. In a recent sentiment analysis application, we used a large Chinese sentiment lexicon and found that it missed a large number of sentiment words used in social media. This prompted us to make a new attempt to study sentiment lexicon expansion. This paper first formulates the problem as a PU learning probl...
Cross-ratio uninorms as an effective aggregation mechanism in sentiment analysis
There are situations in which lexicon-based methods for Sentiment Analysis (SA) are not able to generate a classification output for specific instances of a given dataset. Most often, the reason for this situation is the absence of specific terms in the sentiment lexicon required in the classification effort. In such cases, there were only two possible paths to follow: (1) add terms to the lexi...
A Supervised Method for Constructing Sentiment Lexicon in Persian Language
Due to the increasing growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...
Journal title:
- Entropy
Volume 18
Publication date: 2016